A Novel Pair-wise Language Detection Approach using Convolutional Neural Network Specifically Targeting Bangla and English
A K M Shahariar Azad Rabby, Md. Majedul Islam, Nazmul Hasan, Jebun Nahar, Fuad Rahman
Accepted for presentation at the 11th IEEE International Conference on Computing, Communication and Networking Technologies (ICCCNT), 1st - 3rd July, 2020, IIT Kharagpur, India
Description
Language detection is an essential pre-processing step in many multilingual document-processing solutions, such as Optical Character Recognition (OCR) and machine translation. Language detection research for Bangla in particular is rare, with only a handful of solutions reported in the literature. In this paper, we present a lightweight, small-footprint convolutional neural network that detects Bangla and English directly from scanned mixed-language document images. The proposed model achieves 99.98% recognition accuracy for this specific two-language classification problem.